Please refer as: Cristea, D.; Postolache, O. D. (2005): How to deal with wicked anaphora, in António Branco, Tony McEnery and Ruslan Mitkov (editors): Anaphora Processing: Linguistic, Cognitive and Computational Modelling, Series IV – Current Issues in Linguistic Theory, vol. 263, Benjamin Publishing Books, Amsterdam/Philadelphia, pp. 17-46.

HOW TO DEAL WITH WICKED ANAPHORA?

DAN CRISTEA¹,², OANA-DIANA POSTOLACHE¹
¹ Al. I. Cuza University, Faculty of Computer Science, Iași – Romania
² Romanian Academy, Institute for Theoretical Computer Science, Iași – Romania
{dcristea,oanap}@info.uaic.ro

Abstract

The paper revises a framework (called AR-engine) capable of easily defining and operating models of anaphora resolution. The engine sees the linguistic and semantic entities involved in the cognitive process of anaphora resolution represented on three layers: the referential expressions layer, the projected layer of referential expressions' features and the semantic layer of discourse entities. Within this framework, anaphora resolution cases usually considered difficult to tackle are investigated and solutions are proposed. Among them are relations triggered by syntactic constraints, lemma and number disagreement, and bridging anaphora. The investigation uses a contiguous text from the belletristic register. The research is motivated by the belief that interpretation of free language in modern applications, especially those related to the semantic web, requires more and more sophisticated tools.

1 Introduction

Although it is generally accepted that semantic features are essential for anaphora resolution, authors of automatic systems, due to the difficulty and complexity of achieving a correct semantic approach, have mainly preferred to avoid the extensive use of semantic information (Lappin & Leass, 1994), (Mitkov, 1997), (Kameyama, 1997). It is well known that anaphora studies reveal a psychological threshold around the value of 80% precision and recall that seems to resist any attempt to be
surmounted by present-day systems (Mitkov, 2002). It is our belief that one of the causes of the current impasse in devising an anaphora resolution (AR) system with a very high degree of confidence should also be sought in the choice of a sub-semantic limitation. Relying mainly on strict matching criteria, in which morphological and syntactic features are of great value, these systems disregard resolution decisions based on more subtle strategies that would allow lemma and number mismatch, gender variation, split antecedents, bridging anaphora or cataphora resolution. Moreover, types of anaphora other than strict coreference, like type/token, subset/superset, is-element-of/has-as-element, is-part-of/has-as-part, etc., often impose more complex types of decision-making, which could reach down to the semantic level as well.

Our study makes use of the AR framework defined by Cristea and Dima (2001) and Cristea et al. (2002a) (called AR-engine), with the aim of applying it to the treatment of cases of anaphora resolution usually considered difficult. The AR-engine approach rests on a view that sees anaphoric relations as having a semantic nature (Halliday & Hassan, 1976), as opposed to a textual nature.

This paper discusses the tractability of implementing models, within the AR framework, capable of tackling cases of anaphora usually considered difficult. The validation of the approach is currently being done on a contiguous free text by informally appreciating the computational feasibility of the proposed solutions within the AR-engine framework. The research is motivated by the belief that interpretation of free language in modern applications, especially those related to the semantic web, justifies more and more sophisticated tools.

We think that our investigation is a step forward towards dealing with really hard anaphora resolution problems such as those occurring in free texts. The study intends to lift a psychological barrier
relative to what is really hard to process. It is our belief that the usual lack of interest in considering hard cases of anaphora in practical settings is not always motivated by high modelling and computational costs, and that their notoriety as “untouchables”, tacitly accepted, is exaggerated. The really hard life in dealing with AR begins only when world knowledge is to be put on the table. In this paper we try to prove that, until that point, there is still a lot to do.

The presentation proceeds as follows: section 2 describes AR-engine: its basic principles, the constituent parts in the definition of a model within the framework and the basic functionality of the engine put to analyse a free text. Sections 3 to 7 discuss cases of AR, from simpler to more complex. Finally, section 8 presents preliminary evaluation data and conclusions.

2 The framework

2.1 The AR-engine¹ basic principles

In (Cristea & Dima, 2001), (Cristea et al., 2002a) a framework having the functionality of a general AR engine and able to accommodate different AR models is proposed. This approach recognizes the intrinsic incrementality of the cognitive process of anaphora interpretation while reading a text or listening to a discourse. It sees the linguistic and semantic entities involved in the process of AR as settled on two fundamental layers: a text layer, populated with referential expressions (REs)², and a deep semantic layer, where discourse entities (DEs), representations of entities the discourse is about, are placed. Within such a view, two basic types of anaphoric references can be expressed: coreferences, inducing equivalence classes of all REs in a text which participate in a coreference chain, and functional references (Markert et al., 1996), also called indirect anaphora or associative anaphora (Mitkov, 2002), which express semantic relations between different discourse entities, including type/token, is-part-of/has-as-part, is-element-of/has-as-element, etc. As
sketched in Figure 1, chains of coreferential REs are represented as corresponding to a unique DE on the semantic layer, whereas functional references are represented as relational links between the DEs of the corresponding REs.

[Figure 1: Representation of anaphoric relations revealing their semantic nature: a. coreferences; b. functional references]

Representations involving only REs and DEs are the result of an interpretation process applied to a text. Even if the semantic level is kept hidden, these types of representations are implicitly assumed by the majority of anaphora resolution annotation tasks. Indeed, DEs of the semantic layer could be short-circuited by appropriate tags associated to coreferential REs, where each RE points either to the first RE of the chain or to the most recent antecedent RE. Analogously, in the case of functional references the annotation tags associated to the surface REs name the nature of the referential function.

However, if we are interested in modelling the interpretation process itself, in a way that simulates the cognitive processes developed in a human mind during text reading, the need for another intermediate layer can immediately be argued for. On this layer, which we will call the projection layer, feature structures (in the following, projected structures, PSs) are filled in with information fetched from the text layer, and all the resolution decisions are to be negotiated between PSs of the projection layer and DEs of the semantic layer.

¹ AR-engine and the related documentation are freely available for research purposes at http://consilr.info.uaic.ro
² We will restrict this study to nominal referential expressions only.

We will say that a PS is projected from an RE and that a DE is proposed (if it appears for the first time in the discourse) or evoked (if it exists already) by a
PS (Figure 2).

[Figure 2: The three-layer representation of: a. two coreferring expressions (each RE projects a PS; PSa proposes DEa and PSb evokes DEa); b. two functional referential expressions (PSa proposes DEa, PSb proposes DEb, and the two DEs are linked by a relation)]

As terminology, we denote by referential expression (RE) any noun phrase having a referential function, including the first mentions of an entity. The coreference relation (two REs are coreferent if they refer to the same entity (Hirschman et al., 1997)) is, in most cases, anaphoric³, while not all anaphoric relations are coreferential (e.g. bridging anaphorae). Then, according to the usual acceptance (see for instance (Mitkov, 2002)), if REb corefers with REa, with REb following REa in text, we say that REb is the anaphor and REa the antecedent. In order to stress the semantic nature of anaphora as a referential relation (Halliday & Hassan, 1976), while anaphors and antecedents remain intrinsically connected to the text, discourse entities belong to the semantic layer and are said to be the referents of REs. The unique DE that is referred to by a set of REs disposed in sequence thus reveals the equivalence class of these REs as a chain of coreferring expressions.

³ For the definition of anaphoric relations we adopt a somewhat different position than Deemter and Kibble (2000), for instance. They argue that, following the definition of anaphora: "an NP α1 is said to take an NP α2 as its anaphoric antecedent if and only if α1 depends on α2 for its interpretation" (e.g. Kamp & Reyle, 1993), W. J. Clinton and Hillary Rodham’s husband are not anaphoric, since Hillary Rodham’s husband can be understood as W. J. Clinton by itself, therefore without the help of the former RE. Our meaning for "α1 depends on α2 for its interpretation" is "α1 and α2 are related in the given setting". In this sense, the two REs above are anaphoric if the intent of the writer is to let the reader establish a link between the two mentions, in this particular case, as the same person. In Cristea (2000), co-referential non-anaphoric references are called pseudo references. These are REs which, although referring to the same entity, can be understood independently, without the text interpretation suffering if a relation between them is not established (for instance, two mentions of the sun: I woke up this morning when the sun rose; and later on: I read a book about Amenomphis the IVth, the Egyptian pharaoh, son of the sun).

Figure 3 presents a sequence of phases during the functioning of the AR-engine in which two referential expressions are found to corefer. First the referential expression REa is identified on the text layer. It projects down to the projection layer a feature structure composed of a set of attribute-value pairs, PSa (Figure 3a). Supposing the model decides in favour of considering REa as introducing a new discourse entity during interpretation, the feature structure PSa proposes an adequate semantic representation on the semantic layer, DEa, mainly a copy of PSa (Figure 3b). Because the aim of the projected structure is to help the proposal/identification of a discourse entity, once this task has been fulfilled, the projected structure can be discarded. The result is a bidirectional link that will be kept between REa and the corresponding DEa. Some moments later, when a referential expression REb is identified on the text layer, it projects a feature structure PSb on the projection layer (Figure 3c). Finally, if the model takes the decision that PSb evokes DEa, a bidirectional link between REb and DEa is established and PSb is discarded (Figure 3d). A similar sequence takes place when types of anaphoric relations other than strict coreference are established.
[Figure 3: a. Projection of PSa from REa; b. Proposing of DEa from PSa; c. Projection of PSb from REb; d. Evocation of DEa by PSb]

2.2 Definition of an AR model

The AR-engine framework can accommodate different AR models. Such a model is defined in terms of four components.

The first component specifies the set of attributes of the objects populating the projection and semantic layers and their corresponding types. Different approaches in AR may lead to specific options regarding which features of the anaphor and the referent are to be considered important in the resolution process. An analysis of the state of the art of the existing approaches suggests a classification of the possible features (attributes) along the following lines:

a. morphological features:
- grammatical number;
- grammatical gender;
- case.
All known approaches use morphological criteria to filter out antecedents. However, there are frequent cases when elimination of possible referential links based on mismatches of morphological features may lead to erroneous conclusions. Barlow (1998), for instance, presents examples when gender concord between a pronominal anaphor and a common noun antecedent seems to be unobserved (Su Majestad suprema… él⁴, in which the antecedent is a feminine NP and the anaphor a masculine pronoun; in English, his supreme Majesty… he displays no such problem because English nouns do not have genders). Also, most languages acknowledging gender distinction have a number of nouns or phrases that can be referred to by both masculine and feminine pronouns, according to the natural gender of the person designated (le docteur… elle; in English the doctor… she). Though we do not share Barlow’s view in this respect, namely that morphology should be ignored in AR, a less categorical approach with respect
to a filtering rule based on morphology is preferable.

⁴ In Spanish: supreme Majesty (feminine noun) … he.

b. syntactic features:
- full syntactic description of REs as constituents of a syntactic tree (Lappin and Leass, 1994), (Hobbs, 1978);
- marking of the syntactic role for subject position or obliqueness (the subcategorisation function with respect to the verb) of the REs, as in all centering-based approaches (Grosz et al., 1995), (Brennan et al., 1987), syntactic domain based approaches (Chomsky, 1981), (Reinhart, 1981), (Gordon & Hendricks, 1998), (Kennedy & Boguraev, 1996);
- quality of being adjunct, embedded or complement of a preposition (Kennedy & Boguraev, 1996);
- inclusion or not in an existential construction (Kennedy & Boguraev, 1996);
- syntactic patterns in which the RE is involved, which can lead to the determination of syntactic parallelism (Kennedy & Boguraev, 1996), (Mitkov, 1997);
- the quality of being in an apposition or a predicative noun position.

c. lexico-semantic features:
- lemma;
- person⁵;
- name (for proper nouns);
- natural gender;
- the part of speech of the head word of the RE. The domain of this feature contains: zero-pronoun (also called zero-anaphora or non-text string), clitic pronoun, full-fledged pronoun, reflexive pronoun, possessive pronoun, demonstrative pronoun, reciprocal pronoun, expletive “it”, bare noun (undetermined), indefinite determined noun, definite determined noun, proper noun (name)⁶;
- the sense of the head word of the RE, as given, for instance, by a wordnet⁷;
- position of the head of the RE in a conceptual hierarchy (hypo/hypernymy), as in all models using wordnets (Poesio et al., 1997), (Cristea et al., 2002a). Features such as animacy, sex (or natural gender) and concreteness could be considered simplified semantic tags derived from a conceptual hierarchy;
- inclusion in a wordnet synonymy class;
- semantic roles, out of which selectional restrictions, inferential links, pragmatic limitations, semantic parallelism and object preference can be verified.

d. positional features:
- offset of the first token of the RE (an NP) in the text (Kennedy & Boguraev, 1996);
- inclusion in an utterance, sentence or clause, considered as a discourse unit (Azzam et al., 1998), (Cristea et al., 1998). This feature allows, for instance, calculation of the proximity between the anaphor and the antecedent in terms of the number of intervening discourse units.

e. other features:
- inclusion or not of the RE in a specific lexical field, dominant in the text (this is called “domain concept” in (Mitkov, 1997));
- frequency of the term in the text (Mitkov, 1997);
- occurrence of the term in a heading (Mitkov, 1997).

⁵ Since, among the nominal REs, only pronouns can distinguish the person, for our purposes person is a lexical feature.
⁶ As mentioned already, this classification takes into account only nominal anaphors, therefore ignoring verbal, adverbial, adjectival, etc. (Mitkov, 2002).
⁷ We prefer to use wordnet as a common noun when we refer to any language variant (Vossen, 1998), (Tufiș & Cristea, 2002a) of the original American English WordNet (Miller et al., 1993).

The second component of a model is a set of knowledge sources intended to fetch values from the text to the attributes of the PS. A knowledge source is a virtual processor able to fill in values for one single attribute on the projection layer. Depending on the application the AR-engine is coupled to, as well as on the format of the input, sometimes more than just one such virtual processor could be served by one NLP processor. Thus, a morpho-syntactic tagger usually serves several knowledge sources, as it can provide at least lemma, grammatical number and gender, case, person and part of speech of the head word of the RE (Brill, 1992), (Tufiș, 1999). An FDG (functional dependency grammar) parser (Järvinen & Tapanainen, 1997) fetches the syntactic role of the RE, while wordnet
access functions can bring all the headword senses (or synsets), and their position in a conceptual hierarchy. If word sense disambiguation (WSD) is available as a knowledge source, then the exact word sense of the headword in the corresponding context can be determined. The membership of an RE to a certain segment can be the contribution of a discourse segmenter or a syntactic parser.

The third component is a set of matching rules and heuristics responsible for deciding whether the PS corresponding to an RE introduces a new DE or, if not, which of the existing DEs it evokes. This set includes rules of the following four types:
- certifying rules, which, if evaluated to 'true' on a pair (PS, DE), certify without ambiguity the DE as a referent of the PS. For instance, coreference based on proper name identity could be implemented, in most application settings, by a certifying rule;
- demolishing rules, which rule out a possible DE as referent candidate of a PS (and, therefore, of its corresponding RE). These rules lead to a filtering phase that eliminates from among the candidates those discourse entities that cannot possibly be referred to by the RE under investigation. The order of application of certifying and demolishing rules is specified in the model through priority declarations;
- promoting/demoting rules (applied after the certifying and demolishing rules), which increase/decrease a resolution score associated with a pair (PS, DE). The evaluation of these rules allows the run of a proposing/evoking phase, in which either the best DE candidate of a PS is chosen from the ones remaining after the demolishing rules have been applied, or a new entity is introduced. The use of promoting/demoting rules can be assimilated to the preferences paradigm, employed by many classical approaches;
- a special section of the third component is dedicated to attribute-filling rules, which are activated each time a new DE is proposed. These rules,
behaving similarly to the certifying ones, are responsible for the setting of anaphoric relations of a functional type. Each such rule receives as parameters: the name of an attribute (a functional relation), and a pair (DE1, DE2), in which DE1 is the current DE and DE2 is a DE previously introduced. If a matching is verified, that attribute of DE1 mentioned as the rule’s first parameter receives as value the identifier of DE2.

Finally, the fourth component is a set of heuristics that configure the domain of referential accessibility, establishing the order in which DEs have to be checked, or certain proximity restrictions. For instance, if we want to narrow the search for an antecedent to a vicinity of five sentences (or discourse units), with the intent of reducing the resolution effort on the basis that the great majority of anaphors can find an antecedent within this range, e.g. (McEnery et al., 1997), then the fourth component of the model will record that only those DEs linked with REs belonging to the last five discourse units are considered. Not least, the domain of referential accessibility can model a linear search back order (Mitkov, 2000), or a hierarchical search back order on the discourse tree structure.

Figures 4 and 5 display an example of a domain of referential accessibility for the linear case and the hierarchical case, respectively. Figure 4a shows a case when REa evokes DEa and REb evokes DEb. Then the order to search the candidate referents for PSc (projected from REc) is DEb first, then DEa. If a match between PSc and DEa is found (Figure 4b) then, for a subsequent REd, the order to search the candidate referent matching the corresponding PSd is DEa first, then DEb (Figure 4c). If, instead, hierarchical order is preferred, considering that REa, REb and REc belong to three adjacent discourse units whose vein structure (Cristea et al., 1998, 2000) is the one depicted in Figure 5 in bold line⁸, then the order to
consider the candidate referents for PSc (projected from REc) is DEa first and DEb after, since, hierarchically, REa (and therefore its corresponding DEa) is closer to REc than REb (and its corresponding DEb).

In certain cases it could be of help to see the domain of referential accessibility as dynamically scaled on the type of the anaphor. A synthesis done by Mitkov (2002, p. 24) shows that demonstrative anaphors find their antecedents more distantly than pronouns, while this distance could be even greater in the case of definite nouns and proper nouns. Rules of this kind could be included in the fourth component of the AR-engine.

The framework is language independent, in the sense that the adjustment to one language or another consists in defining a specific set of attributes, establishing the language-specific knowledge sources capable of filling them and devising evoking heuristics/rules specific to each language. The domain of referential accessibility is thought to be stable to language change.

⁸ The vein expression of an elementary discourse unit (edu) u, following Veins Theory, is a sequence of edus, preceding, including and following u, which account for the minimal coherent sub-discourse focused on u. The bold lines in Figure 5 exemplify a situation in which REb, the linearly most recent RE from REc, is short-circuited by the vein expression of the edu REc belongs to, which means that REa is more prominent in the reader’s memory than REb when REc is read.
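To make the third and fourth components of a model more tangible, the following is a minimal Python sketch, not the AR-engine implementation. It assumes a PS can be approximated by a plain feature dictionary and a DE by a small record; all names (DiscourseEntity, accessible, score, the four example rules and their numeric weights) are hypothetical and chosen only for illustration. The sketch also assumes that certifying rules take priority over demolishing ones, whereas in the framework this order is declared by the model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Attrs = Dict[str, object]  # a projected structure (PS) is modelled as a plain feature dict


@dataclass
class DiscourseEntity:
    ident: str
    attrs: Attrs
    last_unit: int    # discourse unit of the most recent RE linked to this DE
    last_offset: int  # text offset of that RE


def accessible(des: List[DiscourseEntity], current_unit: int,
               window: int = 5) -> List[DiscourseEntity]:
    """Fourth component (linear case): keep DEs evoked within the last
    `window` discourse units, most recently evoked first."""
    in_range = [de for de in des if current_unit - de.last_unit < window]
    return sorted(in_range, key=lambda de: de.last_offset, reverse=True)


BoolRule = Callable[[Attrs, DiscourseEntity], bool]
ScoreRule = Callable[[Attrs, DiscourseEntity], float]


def score(ps: Attrs, de: DiscourseEntity, certifying: List[BoolRule],
          demolishing: List[BoolRule],
          promoting_demoting: List[ScoreRule]) -> Optional[float]:
    """Third component: 1.0 if certified, None if demolished, otherwise the
    sum of promoting/demoting contributions clipped to the range 0 to 1."""
    if any(rule(ps, de) for rule in certifying):    # assumed priority: certify first
        return 1.0
    if any(rule(ps, de) for rule in demolishing):
        return None
    total = sum(rule(ps, de) for rule in promoting_demoting)
    return max(0.0, min(1.0, total))


# Hypothetical example rules; the weights are arbitrary illustration values.
def same_name(ps: Attrs, de: DiscourseEntity) -> bool:
    return ps.get("name") is not None and ps.get("name") == de.attrs.get("name")


def different_names(ps: Attrs, de: DiscourseEntity) -> bool:
    a, b = ps.get("name"), de.attrs.get("name")
    return a is not None and b is not None and a != b


def same_lemma(ps: Attrs, de: DiscourseEntity) -> float:
    return 0.5 if ps.get("lemma") == de.attrs.get("lemma") else 0.0


def number_mismatch(ps: Attrs, de: DiscourseEntity) -> float:
    a, b = ps.get("number"), de.attrs.get("number")
    return -0.25 if a is not None and b is not None and a != b else 0.0
```

Treating the number mismatch as a demoting rule rather than a demolishing one follows the less categorical use of morphology argued for above. The situation of Figure 4 can be reproduced by creating DEa and DEb, calling accessible to obtain the order DEb, DEa, then updating last_unit and last_offset of DEa once REc is linked to it, after which the order becomes DEa, DEb. The hierarchical order of Figure 5 would require the vein structure of the discourse and is not covered by this sketch.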
[Figure 4: Linear search order (panels a, b, c)]

[Figure 5: Hierarchical search order]

2.3 Processing anaphors with AR-engine

Figure 3 depicts the main processing stream of AR-engine. The fundamental assumption is that anaphors should be resolved in a left-to-right order in left-to-right reading languages, and vice versa in right-to-left reading languages. This way, the linear processing done by humans while reading, from the beginning of the text to its end, is mimicked. At any moment during processing just one RE is under investigation, which we will call the current RE. As the current RE is, momentarily, the last one on the input stream, all resulting activity is performed against DEs already existent and, therefore, all found relations will point towards the beginning of the text.

One processing cycle of the engine deals with the resolution of one RE and develops along three compulsory phases and an optional one. The first (mandatory) phase is the projection phase, when a PS (called the current PS) is built on the projection layer, using the information centred on the current RE obtained from the text layer with the contribution of the available knowledge sources.

The second (mandatory) phase, proposing/evoking, is responsible for matching the current PS towards one DE, either by proposing a new discourse entity or by deciding on the best candidate from the existent ones. This process involves first running the certifying and demolishing rules (if available), followed by the promoting/demoting rules. In the end, either an existent DE is firmly identified by a certifying rule, or matching scores between the current PS and a
class of referent DEs are computed. Based on these scores, three possibilities can be judged:
1. all candidate DEs range under threshold_min, a parameter of the engine in the range 0 to 1: the interpretation is that none of the preceding DEs is sufficiently convincing as a referent for the current RE, and therefore a new DE is built. Each time a DE is created, a relation (type-of, is-part-of, etc.) is searched for between the new DE and previous DEs in a window of a certain length. Responsible for this activity are the attribute-filling rules;
2. the best rated scores are above threshold_min, but more than one candidate is placed in the threshold_diff range (a parameter usually less than 0.1): this situation should be interpreted as a lack of enough evidence to firmly consider one referent (the one scored the best) as the selected candidate. Consequently, the decision to choose a referent is postponed in order to allow following resolutions to bring supplementary clues to the resolution of the current RE, and the postponed corresponding PS is left on the projection layer;
3. the best score is rated above threshold_min and there is no other score under it within the threshold_diff range: the interpretation is that the corresponding candidate individualises itself strongly among the rest of the DE candidates. It will be confirmed as the referent, and any of the preceding REs of the current RE which correspond to the identified DE should be considered antecedents of the current RE.

In the third compulsory phase, the completion phase, the data contained in the resolved PS is combined with the data configuring the found referent, if such a DE has been identified, or, simply, the PS content is copied onto the newly built DE if none of the already existing DEs has been recognised. The resolved PS is afterwards deleted from the projection layer, since any information that it used to capture can now be recovered from the DE. So, to give an extreme example, if for some reason a model chooses to look
for previous syntactic patterns of chained REs, they can be found on the semantic level. Although apparently contradictory to the “semantic” significance of the layer, this behaviour can mimic the short-term memory that records information of value for immediate anaphoric resolution.

Finally, the optional re-evaluation phase is triggered if postponed PSs remained on the projection layer at a former step. The intent is to apply the matching rules again on all of them. Humans usually resolve anaphors at the time of reading, but sometimes decisions should be postponed until the acquisition of complementary information adds enough data to allow a disambiguation process. Cases of postponed resolution will be discussed in section 7.2.

At the end of processing, each RE should record a link towards its corresponding DE and each DE should record a list of links towards its surface REs.

As we shall see in sections 3 to 6, when referential relations other than strict coreference are to be revealed, DE attributes which are not directly triggered from the corresponding PSs appear as necessary. As mentioned at item 2 of the proposing/evoking phase, a section dedicated to actions to be performed for the filling-in of specific attributes following a proposing action is opened in the third component of the framework, the one dedicated to rules and heuristics.

In the examples to follow we will mark REs by italic letters (as a car) and their corresponding DEs by a paraphrasing text in bold fonts and within square brackets (as [the car]).

The following sections will analyse, within the AR-engine framework, a set of AR cases usually considered difficult to interpret. The discussion intends to evidence specific difficulties inherent to a large range of anaphoric phenomena, to imagine solutions in terms of an AR model, by naming knowledge sources and rules/heuristics capable of dealing with the identified tasks, and to informally appreciate the tractability
of these solutions. The discussion stops short of the universal panacea for all the failures in AR: world knowledge (WK).

3 Relations triggered by positional and/or syntactic constraints

3.1 Nested referential expressions

(1) the University building
(2) Amenomphis the IVth's wife
(3) the face of the beautiful queen

In constructions of these types, two included (nested) REs are involved. They refer to two distinct DEs, which are linked by a certain relation. In (1), the two DEs are [the University building] and [University], where [the University building] belongs-to [University]. In (2), between [Amenomphis the IVth's wife] and [Amenomphis the IVth] a variant of the belongs-to relation holds, perhaps a commitment. In (3), between [the face of the beautiful queen] and [the beautiful queen] a still different type of belongs-to relation holds, perhaps an is-part-of relation. In all cases, the possessed object (or the part) corresponds to the outer RE while the possessing entity (or the whole) corresponds to the inner RE on the surface string.

The incremental type of processing, including surface string parsing, and the included pattern of the REs allow the possessing entity (corresponding to the inner RE) to be processed before the possessed entity (corresponding to the outer RE). If RE1 is nested in RE2 on the text layer, a knowledge source should fetch the value RE1 to a nesting slot of the PS corresponding to RE2. On DE2 of the semantic layer, this slot will later be transformed, by an attribute-filling rule, into a belongs-to (or some variation of it) attribute indicating the DE corresponding to RE1.

Other constructions where a belongs-to relation or variations of it are correctly included are⁹: (the center of (the hall opposite the big telescreen)), (emblem of (the Junior Anti-Sex League)), (one of (the middle rows)), (one of (them)), (one of (the novel-writing machines)). In some cases the rule should be applied
recursively: (the waist of ((her) overalls)), (the shapeliness of ((her) hips)). However, in expressions like: (the hall opposite (the big telescreen)), (preparation for (the Two Minutes Hate)), (some mechanical job on (one of the novel-writing machines)), (a bold-looking girl, of (about twenty-seven)), (the girl with (dark hair)), the relation between the two constituents is different from belongs-to or its variations. Our refinement of the types of relations to consider did not go so far. Moreover, a demolishing rule should always prevent a coreference relation between the DEs corresponding to the two REs.

⁹ From G. Orwell’s “1984”.

3.2 Apposition

(4) Mrs Parsons, the wife of a neighbour on the same floor
(5) Nefertiti, Amenomphis the IVth's wife
(6) Jane, beautiful girl, come to me!
(7) a sort of gaping solemnity, a sort of edified boredom

An apposition usually brings supplementary knowledge on a discourse entity. In agreement with other approaches (Mitkov, 2002), but in disagreement with the annotation convention of MUC-7, which sees the apposition as one RE and the pair of the two elements as another RE, we consider the two elements of the apposition as different REs. In the model that we have built, the type of relation linking the two
REs obeys the following heuristic: definite determined NPs, genitival appositions and undetermined NPs, as in (4), (5) and (6), yield coreferences, whereas indefinite noun appositions, as in (7), yield type-of relations from the DE corresponding to the second RE towards the DE corresponding to the first RE.

Let RE2 be an apposition of RE1 on the text level. We suppose a knowledge source capable of applying syntactic criteria in order to fetch an apposition-of=RE1 slot attached to PS2. As PS1 should have matched a DE1 by the moment PS2 is processed, a certifying rule must unify PS2 with DE1 in case RE2 is a definite determined NP, an undetermined NP or a genitival construction. As a result, DE1 will accumulate all the attributes of PS2. Examples of cases correctly interpreted following this strategy are (footnote 10): (Emmanuel Goldstein), (the Enemy of the People); (the primal traitor), (the earliest defiler of the Party's purity).

If the apposition is an indefinite determined NP, a demolishing rule will rule out, as a possible antecedent, the argument of the apposition-of attribute in the current PS. As a consequence, the usual proposing/evoking mechanism will work, ending in finding a target DE. Then, only if the found DE is new, a rule in the attribute-filling section of the set of rules/heuristics will exploit the apposition-of=RE1 slot attached to PS2 in order to transform it into a type-of=DE1 value. This strategy will correctly interpret an apposition like (a narrow scarlet sash), (emblem of the Junior Anti-Sex League).

Unfortunately, the knowledge source responsible for detecting appositions can easily err. This is the case when apposition is iterated over more than two adjacent constituents: (the most bigoted adherents of the Party), (the swallowers of slogans), (the amateur spies) and (nosers-out of unorthodoxy); (a man named O'Brien), (a member of the Inner Party) and (holder of some post so important and remote), where clear criteria to disambiguate from
enumerations or from indications of locations (as in (the same row as Winston), (a couple of places away)), the only two types of exceptions found so far matching the patterns of our apposition-finding knowledge source, are difficult to devise.

Footnote 10: From G. Orwell's “1984”.

3.3 The subject – predicative noun relation

(8) Maria is the best student of the whole class
(9) John is a high school teacher
(10) Your rival is a photo
(11) The young lady became a wife

Supposing RE1 is the subject and RE2 is the predicative noun, a knowledge source of a syntactic nature should be able to fetch a predicative-noun-of=RE1 attribute into the PS2 corresponding to the predicative noun RE2. Definite determined predicative nouns, such as the best student of the whole class in (8), are, in our model, considered coreferential with the subject. The resolution should aim at injecting into the DE [Maria] the information brought by the predicative noun RE2 and temporarily stored on PS2. Suppose the DE [Maria] is something of the kind: [name="Maria", sem=person1, Ngen=fem, num=sg], where person1 is the first sense of the word person according to WordNet. Then the fact that she is now also seen as a student must not affect any of the attributes name, Ngen (natural gender) or num (grammatical number), but should instead add to the description an attribute lemma=student (if only the head of the RE is considered in the representation, or a more sophisticated description if the constituents are also kept: the best of the whole class), and replace the person1 value of the sem attribute with a more specific one: student1 (footnote 11).

When the predicative noun is an indefinite NP, as in (9), our model interprets it as the semantic type of the subject. The more general concept is replaced with a more specific one both when a concept is predicated as a more specific one (the animal is an elephant) and when the reverse predication holds (the elephant is a heavy animal with a trunk).
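As an illustration only, the treatment of definite and indefinite predicative nouns described above can be sketched in Python. The dictionary representation of DEs and PSs, the toy hypernym table standing in for WordNet, the `definite` flag, the `type-of` key and the function names are all assumptions made for this sketch; they are not the AR-engine implementation.

```python
# Toy hypernym table standing in for WordNet (illustrative assumption).
HYPERNYM = {
    "student1": "person1",
    "teacher1": "person1",
    "person1": "organism1",
    "elephant1": "animal1",
    "animal1": "organism1",
}


def is_more_specific(a, b):
    """Return True if sense `a` lies strictly below sense `b` in the toy hierarchy."""
    node = HYPERNYM.get(a)
    while node is not None:
        if node == b:
            return True
        node = HYPERNYM.get(node)
    return False


def absorb_predicative_noun(de, ps):
    """Return a copy of the subject's DE enriched with the predicative noun's PS.

    name, Ngen and num are never touched; only sem, lemma and the
    hypothetical type-of attribute can change.
    """
    merged = dict(de)
    # Keep the more specific concept, whichever side it comes from.
    if is_more_specific(ps["sem"], merged["sem"]):
        merged["sem"] = ps["sem"]
    if ps["definite"]:
        # Definite predicative noun: coreference, so the lemma is added to the DE.
        merged["lemma"] = ps["lemma"]
    else:
        # Indefinite predicative noun: recorded as the semantic type of the subject.
        merged["type-of"] = ps["lemma"]
    return merged


maria = {"name": "Maria", "sem": "person1", "Ngen": "fem", "num": "sg"}
best_student = {"lemma": "student", "sem": "student1", "definite": True}
print(absorb_predicative_noun(maria, best_student))
```

In this sketch the indefinite case, as in (9), only records the type and, where applicable, specialises the sem value; the reverse predication (the elephant is a heavy animal) leaves the more specific sem value in place.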
Other examples of the same kind are (footnote 12): (one of them) was (a girl); (she) was (a bold-looking girl, of about twenty-seven); (who) were (the most bigoted adherents of the Party); (the other person) was (a man named O'Brien); (O'Brien) was (a large, burly man); (she) might be (an agent of the Thought Police) (footnote 13).

Conceptual hierarchies like WordNet can help to identify, in examples like (10), that a photo (an object) cannot be a type for [the rival] (hyponym of person, according to WordNet). On the other hand, finding out that a photo is a substitute for the person shown in the photo requires deep WK. To offer a substitute for a solution in cases like this, a generic relation like metaphoric-type-of can be adopted.

The solution we adopted for representing discourse entities subject to changes in time, different from the one proposed in MUC-7 (Hirschman & Chinchor, 1997), is described in (Cristea & Dima, 2001): we have linked entities such as the ones in example (11) with the same-as relation, triggered by the occurrence of the interposed predicate become.

In all cases (8) to (11), a complication arises when the resolution of RE1 (the subject) is still postponed at the moment RE2 (the predicative noun) is processed (footnote 14). If this happens, either the unification makes PS2 coreferential with the postponed PS1, or the semantic relation is established between the currently proposed DE and the postponed PS1. Later on, when the postponed PS is lowered to the semantic level, these relations are maintained.

Footnote 11: The implicit assumption here was that WSD capabilities were used as a knowledge source.
Footnote 12: From G. Orwell's “1984”.
Footnote 13: The present model does not implement specific criteria to deal with modalities.
Footnote 14: The same is true for apposition.

4 Lemma disagreement of common nouns

4.1 Common NPs displaying identical grammatical number but different lemmas

(12) Amenomphis the IVth's wife … the beautiful queen

The discovery of the coreference relation in this case should mainly be similarity-based. In principle, a queen should be found more similar to a wife than to a pharaoh, supposing Amenomphis is known to be one. If, instead, this elaborate knowledge is not available, and all that is known about Amenomphis, as contributed by a name-entity recogniser knowledge source, is that he is a man, then at the moment the beautiful queen is processed, a queen should again be found more similar to a wife than to a man. Many approaches to measuring similarity in NLP are already known and some use wordnets (e.g. (Resnik, 1999)). When a sense disambiguation procedure is lacking, a wordnet-driven similarity that counts the common hypernyms of all senses of the two lemmas could be a useful substitute in some cases (footnote 15). Still, criteria to decide similarity are not elementary, and a simple intersection of the wordnet hypernymic paths of the anaphor lemma and the candidate antecedent lemma often does not work. The following is an example of a chain of erroneous coreferences found on the basis of this simplistic criterion: the centre of the hall opposite the big telescreen | his place | some post so important and remote | the back of one's neck | a chair | places away | the end of the room | the protection of his foreign paymasters (footnote 16).

Sometimes, a useful criterion for the identification of coreferential common noun REs with different lemmas could be the natural gender (queen and wife are both feminine in natural gender). In other cases the antecedent could be recovered by looking at the modifiers of the head nouns. Consider example (13):

(13) the most beautiful women … those beauties

A promoting rule should be able to match the lemma beauty against the modifiers of the head women in the DE for [the most beautiful women].

Footnote 15: There is good reason to believe that such an approach is successful when lexical ontologies as fine-grained in word senses as WordNet are used. This criterion is based on the assumption that senses displaying common ancestors must be more similar than the ones whose hierarchical paths do not intersect.

4.2 Common NPs with different grammatical number and different lemmas

(14) a patrol … the soldiers
(15) the government … the ministers

According to WordNet, in two out of three senses, a patrol is a group and, in one sense out of 4, government is also a group. This suggests filling in a sem=group feature if the synset group, grouping (any number of entities (members) considered as a unit) is found on a hypernymic path of the lemma of a candidate antecedent of the plural NP (see examples (14) and (15)). However, this criterion could prove weak, because many words have senses that correspond to groups (a garden, for instance, has a sense that means a group of flowers, and in a text like A patrol stopped by the garden. The soldiers… there is a high chance of finding the soldiers coreferring with [the garden] instead of [the patrol]). Different criteria should be combined to maximize the degree of confidence, among which a similarity criterion, for instance based on wordnet glosses (as in forest – the trees and other plants in a large densely wooded area) or on meronymy (as in
flock – a group of sheep or goats – HAS MEMBER: sheep – woolly usu. horned ruminant mammal related to the goat), or even the simple identification of antecedents within a fixed collection of collective nouns, as suggested in (Barbu et al., 2002). In principle, this case is similar to the preceding one if an attribute of being a group is included in the representation of the DE referent.

Footnote 16: From G. Orwell's “1984”.

4.3 Common nouns referring to proper nouns

(16) Bucharest … the capital

There is no other means of solving this reference than labelling Bucharest, in its corresponding DE, at the very moment it is processed, with, for instance, a city1 value of a sem attribute. If this labelling information is available, fetched by a name-entity recogniser, then the framework processes the reference the same way it does common nouns with different lemmas.

5 Number disagreement

5.1 Plural pronouns identifying split antecedents

(17) John waited for Maria. They went for a pizza.

Despite the opinion of other scholars on the matter (see, for instance, (Eschenbach et al., 1998)), we do not think that, during the interpretation of (17) above, a discourse entity for the group [John, Maria] must be proposed as soon as the referential expression Maria is parsed. Otherwise, we would have to face a very uncomfortable indecision regarding what groups to consider and when. The mentioned group is seen as a DE only because, at a certain moment as the text unfolds, an anaphor coreferring with it appears: they. In (18) below, there is no need for such a group representation, as the reader is perhaps not conscious of its existence:

(18) John waited for Maria. He invited her for a pizza.

Neither vicinity in the location space of the story, nor textual vicinity or framing in a wording pattern, is a sufficiently constraining criterion for proposing groups on the semantic layer; see examples (19) and (20):

(19) John was in New York when Maria wrote him that she finally made up her mind. They got married the next month.
(20) John finished his classes. He went to a football match. As it was a rainy day, no more than 10 people were in the stadium. Maria happened to be there too. They went for a pizza and one month later got married.

To make life even harder, note that in (20) 12 people are candidates for different groups of persons ([John, 10pers], [10pers, Maria], [John, 10pers, Maria], [John, Maria] or only [10pers]). Nevertheless, the reader has no difficulty identifying they with the group [John, Maria]. But why not attach to the group also [John's classes], [the football match], [the rainy day] or [the stadium]? The obvious WK-based answer is: because none of the others can go for a pizza! And also because getting married is an occupation for exactly two people! But this is deep WK and, as agreed, we would not want to rely on it.

From the discussion above we know that group formation is triggered by a first reference to it. A group, unless it is verbalised as such in the text, does not exist until it is referred to. Still, two questions remain: how much we can do in the absence of WK for identifying the group content, and what the criteria are to trigger the creation of group DEs, that is, by what means a plural pronoun is considered as referring to a group. The answer to the first question lies again in the use of similarity measures (common association basis in (Eschenbach et al., 1998)) to identify members of groups in the text preceding the plural pronoun. As for the second question, the framework policy is to propose new DEs when no match between the current PS and the preceding DEs rises above thresholdm. This policy is good enough for our purpose as long as no plural DEs, which the plural anaphor could match, are in the recent proximity. If an ambiguity arises, then the second framework policy to postpone resolution until sufficient discrimination criteria leave a
unique candidate within a thresholdd range is well suited again. The combination of these two policies in example (21) below, for instance, would maintain the indecision whether they should corefer with [John, Maria] or [the classes] as long as no WK is available to state that only people can go for a pizza, and this should be a correct behaviour.

(21) John waited for Maria when the classes were over. They went for a pizza.

5.2 Plural nouns identifying split antecedents

In addition to the problems identified above, when the anaphor is a noun, the similarity criterion found to characterize the group should extend to the anaphor as well. Consider the following example:

(22) Athos, Porthos and Aramis … the musketeers

The similarity criterion sketched above yields person, individual, someone, somebody, mortal, human, soul – (a human being) as the WordNet concept characteristic of the discovered group, while the word musketeer also denotes a person. As such, there is enough evidence to conclude that a DE [the musketeers] should be proposed, pointing to each of the DEs [Athos], [Porthos] and [Aramis] as members. As already discussed in section
2.3, the decoration of existing DEs with attributes different from those inherited from the PS they evolve from, in our case the completion of the DE [the musketeers] with an attribute has-as-element=<x, y, z>, with x, y, z being identifiers of the DEs [Athos], [Porthos] and [Aramis], is an action characteristic of the attribute-filling rules.

6 Bridging anaphora

6.1 Elements-to-set references

(23) all the weapons for the underwater hunting … the masque … the rifle … the ribbon paws

In this example, to each of the REs the masque, the rifle and the ribbon paws there must correspond a proper DE. Moreover, in a proper representation, each of them must contain an attribute is-element-of pointing to the DE [the weapons for the underwater hunting]. The rifle against [the weapons…] is the only relation of this kind that can be easily inferred based on a similarity computation. A masque and a paw are not in themselves weapons, although the context helps to acquire this interpretation. Only reasoning on deep WK would allow for such assignments. If, however, this kind of WK is available, assigning the is-element-of links from all component DEs towards the DE [the weapons…] is also an action characteristic of the attribute-filling rules.

Suppose now a case in which, between two coreferring anaphors, a set to which the corresponding entity belongs is mentioned, as in John and Mary decided they should go to the party, in this order: Mary first, John after (only John and the group-mentioning pronoun are underlined, although the same is also true for Mary). The is-element-of relation between the element DE and the group DE cannot be established when the element is first mentioned, because the element DE is built before the group DE: [John] is built before the group identified by they=[John and Mary]. However, this relation can be inferred as the inverse of an already acquired has-as-element relation, supposed to have been filled between the group DE and the element DE at the moment the group was mentioned, and on the basis of a genuine
coreference relation established between the second mention of the element and its corresponding DE representation.

6.2 Hidden discourse entities

(24) When I got into the room I saw a strange screen saver on the big monitor. The other computer was off.

Interesting debates could arise around this example. Any human reading this text is aware of the existence of two computers in the mentioned room: one with a big screen, on which a strange screen saver was running, and another one which was off. One question is whether both computers should be represented on the semantic layer or only [the other computer]. Since mentioning the other computer makes sense only if [some (first) computer] exists, this can be taken as an implicit mention of the first computer. However, there is no RE in the text explicitly referring to this DE, except for the big monitor, which is interpreted as part of this computer. But a representation for a [some (first) computer] entity cannot appear at the moment the big monitor, a part of it, is mentioned, because otherwise we see no reason to consider only the is-part-of relation and to neglect others, like made-of, spatial relations like laying-upon, etc. There is no end to describing all the objects with which a certain mentioned object could interact. For instance, in some reader's mind at least the image of a table on which one or both computers lie is present.

A safer solution (at the level of automatic reasoning assumed by our framework) is to consider as candidates for representation on the semantic layer strictly those objects that are explicitly mentioned in the text. If a more elaborate resolution model is to be attached on top of the work performed by the AR-engine, then those hidden DEs should be revealed through an inference mechanism, which is not supported by the current level of processing.

What the engine would have to do in the case of example
(24) is to build a DE corresponding to the RE the big monitor and another DE for the RE the other computer. No relations link these representations (footnote 17). On the contrary, in a sequence like the one in example (25), the DE [the computer] should display a has-as-part relation towards the DE [the big monitor].

(25) When I got into the room I saw a strange screen saver on the big monitor. The computer was left open by my colleague.

In this example an attribute-filling rule must be responsible for filling in a value of a has-as-part attribute. The difference between examples (24) and (25) is that in (24) the method should refrain from retaining the identifier of the DE [the big monitor] as the value of the attribute has-as-part of the DE [the other computer], while in (25) it should go for it.

Footnote 17: Behavior as sophisticated as simultaneously projecting two PSs from an RE such as the other x, which would allow for the identification of two objects of the type [x], is not currently implemented in the AR-engine framework.

7 The resolution moment

7.1 Resolution in the case of cataphora

A rather controversial anaphoric phenomenon is cataphora, which is said to arise “when the reference is made to an entity that is mentioned subsequently in the text” (Mitkov, 2002). In our terms, a cataphoric relation is given by a pair of coreferring mentions in which the first one introduces the referent and is information-poorer than the subsequent one. The only cases that merit special attention are those defined as ‘first-mention’ cataphora (Mitkov, 2002) or ‘backwards anaphora’ (Carden, 1982), like the one in the following text placed at the beginning of O. Wilde's “The Picture of Dorian Gray”:

(26) “From the corner of the divan of Persian saddle-bags on which he was lying, smoking, as was his custom, innumerable cigarettes, Lord Henry Wotton could just catch the gleam of the honey-sweet blossoms of a laburnum…”

In cases where a pronoun precedes a noun but
the text contains an earlier, more informative mention of the same entity, in accordance with other scholars (see, for instance, the analysis by Tanaka (2000)), the pronoun should be resolved against the preceding text as in ordinary anaphora.

Figure 6: A cataphora resolution example. Three panels show the text, projection and semantic layers over time. (a) At t0, he projects a PS with gender=masc, number=sg, sem=person, lowered to a new DE with the same features. (b) At t1, Lord Henry Wotton projects a PS with gender=masc, number=sg, sem=person, name=Lord H. W. (c) The DE becomes gender=masc, number=sg, sem=person, name=Lord H. W.

The view we have on this topic is that, once a linear processing model, from the beginning of the text to its end, is adopted, there is no way in which, when reading a cataphoric referential expression, one would look towards the end of the text in order to recover a referred entity. Consequently, the moment the pronoun is read/processed, a poorly decorated discourse entity must be introduced into the state of mind of the reader, and subsequent coreferring expressions evoke this entity, possibly adding new features to it. As remarked in section 2.3, the linear (incremental) processing hypothesis also implies that the anaphoric relation should always be projected on the text axis towards the beginning of the text.

At the moment of reading/processing the pronoun he in the example above, first a PS is projected. Then this is immediately lowered to the semantic layer as a proposed DE. This moment is marked t0 in Figure 6a, and the corresponding semantic representation cannot contain more features than those contributed by genuine morphology (gender and number) and a semantic feature of being a person. As the text unfolds, at a later moment t1, Lord Henry Wotton is processed and a PS containing morpho-semantic
features, as suggested by Figure 6b, is proposed. As this feature structure strongly matches (in gender, number and sem) the previously created DE, the evoking phase will most probably indicate it as the referent. Then, during the completion phase, the name feature will enrich the original DE, introduced by the pronoun (Figure 6c).

7.2 Postponed resolution

The mechanism of postponed resolution that AR-engine incorporates allows the solving of otherwise intractable cases. Consider example (27):

(27) No one knew who was the driver who drove up the actor home that night. Later on, everybody found out that this was the best driver Hollywood ever had.

The moment this is to be resolved, there is not sufficient knowledge to decide whether it refers to [the driver], [the actor] or even [that night]. However, immediately after the predicative noun the best driver is read, two things happen: a) the best driver is found to refer to [the driver], a DE already introduced, and b) the predicative noun should corefer with the subject (see section 3.3). So, from the fact that the predicative noun the best driver is coreferential with the subject this, and that the same the best driver is resolved against the DE [the driver], this can be recognized as the same DE [the driver]. This is a postponed resolution, and its completion is realised during the re-evaluation phase of the RE following it on the text level, as discussed in section 2.3. In example (28), application of the same mechanism produces the recovery of this as [the actor]:

(28) No one knew who was the driver who drove up the actor home that night. And when you think that this was the actor that used to be in vogue not long time ago…

8 Final considerations

8.1 Evaluation

To evaluate the proposed solutions we used four chapters, totalling approx. 17,500 words, from the original English version of the novel “1984” by George Orwell. The choice of a text belonging to the belletrist
register, instead of the scientific or technical register, was justified by the intention to assess how frequently the mentioned cases occur in free text, and also how well the proposed solutions fit the wide variety of types of referential expressions and anaphoric phenomena encountered there.

The text was first POS-tagged, then FDG-tagged and then manually annotated for coreference by a group of master students, using the Palinka annotator (Orăsan, 2002). The annotation task did not contain a phase dedicated to markables, as they were extracted automatically from the FDG structure (all structures dominated by a head noun, from which clauses were removed). NP heads were also automatically marked. Our markables generally conform to the MUC-7 criteria (Hirschman & Chinchor, 1997), although ours do not include relative clauses, each term of an apposition is taken separately, and we have also marked wh- noun phrases. Some errors made by the FDG parser that affected the NP annotation were manually corrected.

Four approximately equal parts were assigned to teams of two master students in Computational Linguistics. The students had to annotate their assigned parts individually. To simplify the annotation task, the annotators were instructed to mark only coreference relations. Agreement between pairs of annotators lay in the range 60% to 90%. After seeing the mismatches reported by a program, they had to negotiate common decisions. The document obtained after merging the final negotiated versions was considered the gold standard. To perform the evaluation, all cases of belongs-to, type-of, is-part-of, has-as-part, is-element-of, has-as-element and same-as relations were collected manually.

The model implemented at this stage of research was a rather simple one, since our focus was not so much on refining the coreference performance towards attaining or surpassing the 80% psychological limit, as to
see whether feasible solutions for the investigated cases of wicked anaphora can be imagined. As such, the incorporated AR model contained only the following attributes: lemma, number, pos, femaleName (YES if lemma is a female name), maleName (YES if lemma is a male name), familyName (YES if lemma is a family name), HeSheItThey (the probability of a noun phrase being referred to by the pronouns he, she, it or they), includes (a vector of the Ids of the REs nested in the current RE, possibly empty), indefinite (YES if the RE is an indefinite determined NP and NO if the RE is definite determined or undetermined), predicateNameBE (the Id of the subject when the current RE is a predicative noun of a form of the predicate to be), predicateNameBECOME (the Id of the subject when the current RE is a predicative noun of a form of the predicate to become), apposition (the Id of the RE with which the current RE is in an apposition relation), SYNONYMS (the list of the WordNet synonyms of the lemma, no matter the sense), HYPERNYMS (the list of the hypernymic synset Ids in WordNet, no matter the sense), MERONYMS (the list of the has-part synset Ids in WordNet, no matter the sense), HOLONYMS (idem, for the part-of relations). No syntactic attributes, other than predicative noun and apposition, were retained.

The knowledge sources were implemented based on the following processors: a POS-tagger, an FDG parser, a very simple name-entity recognizer, and a WordNet navigator. The model includes 4 certifying rules, 2 demolishing rules, 5 promoting rules and 5 attribute-filling rules. The domain of referential accessibility considered is linear, and the antecedents were searched within a distance of 10 sentences for coreference and 3 sentences for functional relations. Table 1 shows the dimension of the experiment.

Table 1
                                                      total   % of relations
nested REs                                             1097       29.37
coreferential appositions                                19        0.51
type-of appositions                                       9        0.24
coreferential subject-predicative noun relations         40        1.07
type-of subject-predicative noun relations               45        1.20
same-as relations                                         1        0.03
different lemmas                                       1115       29.85
group noun to split antecedents                           1        0.03
pl. noun to split antecedents                             4        0.11
pl. pron. to split antecedents                           20        0.54
is-element-of                                            34        0.91
is-part-of and has-as-part                              110        2.95
cataphorae                                                8        0.21
total REs                                              5522
total DEs                                              3107
total relations                                        3735

The total number of relations was computed by adding the number of coreferential relations (no. of REs minus no. of DEs, i.e. 5522 - 3107 = 2415) to the number of functional relations. The investigated phenomena amounted to about 2/3 of the total number of anaphoric relations in the corpus (approx. 67%). The rest are genuine coreference relations.

Nested REs pose no problem, because the simple identification of this surface pattern yields a belongs-to relation or a variation of it. At this stage of research, no effort was invested in refining among different subtypes.

By far the best results (precision and recall between 0.8 and 0.92) are obtained for predicative noun to subject relations, relatively easy to identify and catalogue as either coreference or type-of relations. Recognition of type-of relations in the case of appositions also had a good degree of success (0.8 precision and 0.88 recall). Bad precision was obtained for appositional coreferences, explained by the tendency of our external sources to classify as appositions also enumerations, rather than by
inappropriate decisions taken in the resolution process itself. Still, a good recall of 0.94 was obtained in these cases. The difference in precision is explained by the scarcity of cases where terms of enumerations are expressed as indefinites.

We obtained very good precision but bad recall in cases of coreferences involving cataphorae. Examples of failed resolutions of this type are: “Do you think you could come across and have a look at our kitchen sink?” … The Parsons' flat was bigger than Winston's (because of too large a distance in between: 218 interposed words in more than 12 sentences); someone … the children (intended disagreement in number); “We didn't ought to 'ave trusted 'em. That's what comes of trusting 'em. We didn't ought to 'ave trusted the buggers.” (the parser does not recognize 'em as a pronoun); “Take your places, please.” Winston sprang to attention in front of the telescreen. “Take your time by me. Come on, comrades.” (where, because of number confusion, the referent of the first occurrence of your is found to be the already existing [Winston] DE, which will furthermore prevent comrades from coreferring with it).

The only singular group noun to split antecedents example found in the corpus was correctly processed, but an optimistic conclusion here would be premature.

Of the examined cases, the most frequent are found to be different-lemma coreferences. Our implemented model is still too weak to handle these anaphoric phenomena properly: a better similarity-evaluation model is needed. In approx. the same range of precision and recall (32% - 58%) are the results for plural nouns and pronouns referring to split antecedents, as well as the recognition of the is-part-of relations. The following are examples of failed plural-noun-to-split-antecedents references: “Of course it's only because Tom isn't home” said Mrs Parsons vaguely. The Parsons' flat (failure to discover the second occurrence of Parsons as a plural noun); Oceania was at war with Eurasia and in alliance with
Eastasia … the three powers (no WordNet or other name-entity help in identifying the names as state names); Winston was dreaming of his mother. His father he remembered more vaguely … (Winston remembered …) The two of them must evidently have been swallowed up in one of the first great purges of the fifties (only deep understanding of the context in which the reference is used could disambiguate the two of them as being the group of [mother] and [father] and not of [Winston] and [father], for example; also, cardinality of groups as a restriction feature is not yet in the model); At this moment his mother was sitting in some place deep down beneath him, with his young sister in her arms. Both of them were looking up at him (both of them could not be linked to the group [mother] and [sister] for the same reasons as above; this is also a good example of postponed evaluation, since only later, at the moment of reading him, could one decide that the group does not also include the person referred to by him, on the grounds that a group cannot look at a member of itself – WK); an old man and an old woman … “We didn't ought to 'ave trusted 'em” (153 intervening words in 8 sentences, and two more persons mentioned). Note that the current model does not implement group nouns referring to split antecedents when the split antecedents are nouns different in number, as in Mary and her friends went to the cinema. They saw a good movie.

Finally, the worst results were obtained for is-element-of relations. Here are some commented failures: Victory Mansions were old flats … The Parsons' flat was bigger than Winston's (lack of knowledge sources to recognize elliptical heads of genitival constructions, like Winston's); The sacred principles of Ingsoc. Newspeak, doublethink, the mutability of the past (invented terms, not in English, so WN cannot be consulted in order to detect is-element-of relations); All their ferocity was turned outwards, against the enemies of the
State, against foreigners, traitors, saboteurs, thought-criminals (the still weak capacity of the model to recognize similarity: only WordNet hypernymic chains contribute). An interesting example of failure to recognize has-as-part relations is the following: Both of them were dressed in the blue shorts, grey shirts, and red neckerchiefs which were the uniform of the Spies. Here, which is a pronoun referring to the DEs of the group of elements {[the blue shorts], [grey shirts], [red neckerchiefs]}. Although neutral in number when seen in isolation, this pronoun was found to be plural, as the subject of the plural verb were. As a result, a new DE was proposed to represent the set of the three elements, together with a has-as-element relation linking this DE with each of its members. Further on, there is a subject-predicative noun construction with a definite predicative noun: which were the uniform of the Spies, implying therefore a coreference relation. This will finally yield has-as-element relations between [the uniform] and each of its mentioned elements, instead of has-as-part relations (shorts, a shirt and a neckerchief can be parts of a uniform, not elements of it). Perhaps WK is needed to correct this error.

8.2 Conclusions

Modern applications, especially those related to the semantic web, compel the application of combined and complex methods in NLP. These application environments require more and more sophisticated tools to be put to work and, where necessary, AR methods should also be prepared to tackle hard problems raised by the interpretation of free language. The paper investigates cases of difficult AR problems and proposes a set of solutions within the framework of a general incremental AR solver, called AR-engine, previously introduced by Cristea and Dima (2001). The basic principles and architecture of the engine are presented. Our investigation addressed cases of anaphora resolution that were not in focus in previous evaluation attempts
(Cristea et al., 2002a), (Cristea et al., 2002b), where the evaluation was conducted on examples chosen by hand or reported by other authors to be difficult to tackle (Mitkov, 2001), (Barbu et al., 2002). This time, a corpus of continuous text taken from the belletrist register was used. We investigated four categories of anaphoric relations that, we believed, display an ascending degree of difficulty: coreference relations whose resolution could be triggered by positional (syntactic) constraints, coreference relations in which the anaphor and the antecedent are common nouns with disagreement in lemma, noun and pronoun anaphors displaying number disagreement with their antecedents, and bridging anaphora. For the first time, anaphoric references other than genuine coreferences were experimented with in AR-engine. Using the framework, we also discussed two less studied situations of recovering referential links: the case of cataphoric references and situations in which resolution cannot be accomplished synchronously with the moment the anaphor is read. The examples discussed in the paper revealed different degrees of difficulty. Consequently, the knowledge sources brought into play also spanned a very wide range, from cheap ones, such as a POS-tagger, capable of tagging words with morphological features, to extremely expensive ones, like WSD, capable of inferring word senses in context (however, our implemented model did not make use of a WSD knowledge source). Due to the difficulty of organizing a large corpus annotated for such a large diversity of referential links, the dimension of the experiment was limited. The
language under investigation was English. However, the framework is not restricted to one language. Language-dependent expertise is incorporated in a model, which is a configurable component that should be plugged into the engine. Also, any application-specific behaviour, as for instance the type of references to identify, can be described in the model. If infrequent cases require costly implementation solutions and costly computations, the effort is not justified. If, instead, a model can be easily updated to take these cases into account as well, with little difference in computation time, then the effort is worthwhile. It is also worth asking whether there exists an algorithmic optimisation such that expensive methods are called upon only when cheaper methods have proved insufficient. To take the coreference task as an example, expensive methods would have to be put to work only when cheap methods have failed to point firmly to an antecedent among closely rated candidates. This behaviour can easily be added to the functionality of the AR-engine by adequately exploiting thresholds. A disambiguation decision between two candidates is usually taken when their computed scores differ by more than a certain threshold. One could then make this threshold larger if the scores are computed from rules fed by cheap knowledge sources and narrower if they are computed from rules fed by expensive knowledge sources. At this stage of the research the interest was focused, on the one hand, on enhancing the AR-engine and, on the other, on devising rules and heuristics that, integrated into a model, foreshadow the feasibility of the expected solutions for the specific types of anaphora enumerated. Another goal was to define neatly the benefit that certain knowledge sources can bring for certain types of problems. Knowledge sources, as well as resources, should always constitute a configurable component in an AR task. A designer should be able to add to or remove from an AR engine any
such knowledge source depending on their availability, the complexity of the task and the running constraints. In such a configurable setting, it should then be clear what behaviour to expect any time a “surgery” of this kind is performed. Although it is perhaps too soon to draw conclusions regarding the feasibility of the approach, we consider our results promising. The engine has reached a certain stability vis-à-vis the updates required by the specific types of processing imposed by a large diversity of anaphoric phenomena. The most spectacular part of the research is only now ahead of us, when the focus will be on the refinement of the incorporated model.

Acknowledgments

The authors are grateful to our master students in Computational Linguistics, Petronela Dumitraşcu, Corina Forăscu-Nicu, Maria Husarciuc, Monica Lupu, Delia Mihu, Daniel Pintilie, and Diana Trandabăţ-Balaşa, who carried out the annotation task, and to two anonymous reviewers, whose comments greatly helped to improve the final shape of the paper.

References

Azzam, Saliha, Kevin Humphreys & Robert Gaizauskas. 1998. “Evaluating a Focus-Based Approach to Anaphora Resolution”. Proceedings of the 17th Coling and the 36th Annual Meeting of the ACL (COLING-ACL'98), Montreal, Canada, 74-78.
Barbu, Cătălina, Richard Evans & Ruslan Mitkov. 2002. “A corpus based investigation of morphological disagreement in anaphoric relations”. Proceedings of Language Resources and Evaluation Conference - LREC 2002, Las Palmas, vol. VI, 1995-1999.
Barlow, Michael. 1998. “Feature Mismatches and Anaphora Resolution”. New Approaches to Discourse Anaphora, Technical Papers ed. by Simon Botley & Tony McEnery, vol. 11.
Brennan, Susan E., Marilyn W. Friedman & Carl J. Pollard. 1987. “A Centering Approach to Pronouns”. Proceedings of the 25th Annual Meeting of the ACL, Stanford, 157-162.
Brill, Eric. 1992. “A Simple Rule-based Part of Speech Tagger”. Proceedings of the Third Conference on Applied Natural
Language Processing, Trento, 152-155.
Chomsky, Noam. 1981. Lectures on Government and Binding. Dordrecht, The Netherlands: Foris Publishers.
Carden, Guy. 1982. “Backwards Anaphora in Discourse Context”. Journal of Linguistics 18. 361-387.
Cristea, Dan, Nancy Ide & Laurent Romary. 1998. “Veins Theory: A Model of Global Discourse Cohesion and Coherence”. Proceedings of the 17th Coling and the 36th Annual Meeting of the ACL (COLING-ACL'98), Montreal, 281-285.
Cristea, Dan & Gabriela-Eugenia Dima. 2001. “An Integrating Framework for Anaphora Resolution”. Information Science and Technology, Bucharest 4:3-4. 273-291.
Cristea, Dan, Oana-Diana Postolache, Gabriela-Eugenia Dima & Cătălina Barbu. 2002a. “AR-Engine – a framework for unrestricted coreference resolution”. Proceedings of Language Resources and Evaluation Conference - LREC 2002, Las Palmas, vol. VI, 2000-2007.
Cristea, Dan, Gabriela-Eugenia Dima, Oana-Diana Postolache & Ruslan Mitkov. 2002b. “Handling complex anaphora resolution cases”. Proceedings of the Discourse Anaphora and Anaphor Resolution Colloquium, Lisbon.
Deemter van, Kees & Rodger Kibble. 2000. “On Coreferring: Coreference Annotation in MUC and related schemes”. Computational Linguistics 26:4. 615-623.
Eschenbach, Carola, Christopher Habel, Michael Herweg & Klaus Rehkamper. 1989. Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics (ACL ’89), 161-167.
Gordon, Peter C. & Randall Hendrick. 1998. “The Representation and Processing of Coreference in Discourse”. Cognitive Science 22. 389-424.
Grosz, Barbara, Aravind K. Joshi & Scott Weinstein. 1995. “Centering: a Framework for Modelling the Local Coherence of Discourse”. Computational Linguistics 21:2.
Halliday, Michael A. K. & Ruqaiya Hasan. 1976. Cohesion in English. London & New York: Longman.
Hirschman, Lynette & Nancy Chinchor. 1997. “MUC-7 Coreference Task Definition, version 3.0”. MUC-7 Proceedings. See also: http://www.muc.saic.com
Hirschman, Lynette, Patricia Robinson, John
Burger & Marc Vilain. 1997. “Automatic coreference: The role of annotated training data”. Proceedings of AAAI Spring Symposium on Applying Machine Learning to Discourse Processing.
Hobbs, Jerry R. 1978. “Resolving pronoun references”. Lingua 44. Also in Readings in Natural Language Processing, Morgan Kaufmann, Los Altos, 1986, ed. by Barbara Grosz, Karen Sparck-Jones & B. Webber, 339-352.
Järvinen, Timo & Pasi Tapanainen. 1997. “A dependency Parser for English”. The Technical Reports of the Department of General Linguistics, University of Helsinki.
Kameyama, Megumi. 1997. “Recognizing Referential Links: an Information Extraction Perspective”. Proceedings of a Workshop “Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts”, Madrid, 46-53.
Kamp, Hans & Uwe Reyle. 1993. From Discourse to Logic. Dordrecht: Kluwer Academic Publishers.
Kennedy, Chris & Branimir Boguraev. 1996. “Anaphora for Everyone: Pronominal Anaphora Resolution without a Parser”. Proceedings of the 16th International Conference on Computational Linguistics, vol. I, 113-118.
Lappin, Shalom & Herbert J. Leass. 1994. “An Algorithm for Pronominal Anaphora Resolution”. Computational Linguistics 20:4. 535-561.
McEnery, Antonio, Izumi Tanaka & Simon Botley. 1997. “Corpus annotation and reference resolution”. Proceedings of the ACL’97/EACL’97 Workshop on operational factors in practical, robust anaphora resolution, Madrid, 67-74.
Markert, Katja, Michael Strube & Udo Hahn. 1996. “Inferential Realization Constraints on Functional Anaphora in the Centering Model”. Proceedings of the CogSci 1996, 609-614.
Miller, George, Richard Beckwith, Christiane Fellbaum, Derek Gross & Katherine Miller. 1990. “Introduction to WordNet: An on-line lexical database”. International Journal of Lexicography 3:4. 235-244.
Mitkov, Ruslan. 1997. “Factors in Anaphora Resolution: They Are not the Only Things that Matter. A Case Study Based on Two Different Approaches”. Proceedings of the Workshop "Operational
Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts", Universidad Nacional de Educación a Distancia, Madrid, ed. by Ruslan Mitkov & Branimir Boguraev, 14-21.
Mitkov, Ruslan. 2001. “Outstanding issues in anaphora resolution”. Computational Linguistics and Intelligent Text Processing, ed. by Al. Gelbukh, 110-125. Berlin: Springer.
Mitkov, Ruslan. 2002. Anaphora resolution. London: Longman.
Orăsan, Constantin. 2000. “CLinkA a Coreferential Links Annotator”. Proceedings of Language Resources and Evaluation Conference - LREC 2000, Athens, 491-496. See also: http://clg.wlv.ac.uk/projects/PALinkA/
Poesio, Massimo, Renata Vieira & Simone Teufel. 1997. “Resolving bridging references in unrestricted texts”. Proceedings of the Workshop "Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts", Universidad Nacional de Educación a Distancia, Madrid, ed. by Ruslan Mitkov & Branimir Boguraev.
Reinhart, Tanya. 1981. “Definite NP anaphora and c-command domains”. Linguistic Inquiry 6:12. 605-635.
Resnik, Philip. 1999. “Semantic Similarity in a Taxonomy: An Information-Based Measure and its Application to Problems of Ambiguity in Natural Language”. Journal of Artificial Intelligence Research 11. 95-130.
Tanaka, Izumi. 1999. The Value of an Annotated Corpus in the Investigation of Anaphoric Pronouns, with Particular Reference to Backwards Anaphora in English. PhD thesis, University of Lancaster.
Tufiş, Dan. 1999. “Tiered Tagging and Combined Classifiers”. Text, Speech and Dialogue, Lecture Notes in Artificial Intelligence 1692, ed. by Frederick Jelinek & Elmar Nöth. Berlin: Springer.
Tufiş, Dan & Dan Cristea. 2002a. “Methodological issues in building the Romanian Wordnet and consistency checks in Balkanet”. Proceedings of the Workshop on Wordnet Structures and Standardization, and how these affect Wordnet Applications and Evaluation, held in conjunction with The Third International Conference on Language Resources and Evaluation, LREC-2002, Las Palmas.
Vossen, Piek, ed. 1998. A Multilingual Database with Lexical Semantic Networks. Dordrecht: Kluwer Academic Publishers.